A Parallel Boltzmann Machine on Distributed-Memory Multiprocessors
Author
Abstract
In this paper, an efficient mapping scheme of Boltzmann Machine computations onto a distributed-memory multiprocessor, which exploits the synchronous spatial parallelism, is presented. In this scheme, the neurons in a Boltzmann Machine are partitioned into p disjoint sets, and each set is mapped on a processor of a p-processor system. Parallel convergence and learning algorithms of Boltzmann Machines, the necessary communication patterns among the processors, and their time complexities when neurons are partitioned and mapped onto a distributed-memory multiprocessor are investigated. An expected p-processor speed-up of the parallelizing scheme over a single processor is also analyzed theoretically, which can be used as a basis in determining the most cost-effective or optimal number of processors according to the given communication capabilities and interconnection topologies.
Introduction
A Boltzmann Machine [3,7] is a probabilistic neural network model, in which the individual neurons probabilistically determine their next state based on the states of their neighbour neurons and the connection strengths to them. It has been widely applied to well-known combinatorial optimization problems [2,10] and image classification [3]. One major problem in Boltzmann Machine applications is that they require a huge amount of computational resources when the number of neurons in the Boltzmann Machine becomes large, or when the machine is applied to a real-world application problem in which many annealing steps are required to obtain a good solution. A general and natural way to speed up the computations of a Boltzmann Machine is to use a multiprocessor, in which each processor usually takes charge of multiple neurons in the Boltzmann Machine and simulates them in a time-sharing fashion [13]. The basic parallelizing method for a Boltzmann Machine has already been proposed in [1,10]. However,
since they did not consider the overheads of the interprocessor communication needed to exchange the changed neuron states, the actual performance on a parallel computer would differ from their simulation results if communication overheads are not negligible. Other parallelizing methods considering the underlying parallel architecture were also proposed in [6,8]. In these works, however, only a basic parallel convergence algorithm and its appropriate interconnection topology were proposed, without analysis, in [6], and the method was highly dependent on a specific hardware architecture (i.e., the Transputer) in [8]. In this paper, we investigate a new parallelizing method for the convergence and learning algorithms of a synchronous Boltzmann Machine on a parallel computer, especially a DMM (Distributed-Memory Multiprocessor). We follow our previous works [11,12] on the partitioning model, in which the neurons in a neural network are partitioned into p disjoint sets, and each set is mapped on a processor of a p-processor system. A p-processor speed-up is also analyzed theoretically.
Boltzmann Machine
A Boltzmann Machine can be represented by an undirected graph G = (U, C), where U = {u_0, ..., u_{m-1}} denotes the set of m neurons and C is a set of unordered pairs from U denoting the connections between neurons [2]. A connection {u_i, u_j} ∈ C connects the neurons u_i and u_j. The consensus C_k denotes the overall desirability of all the activated connections in a given configuration k and is given by

C_k = \sum_{\{u_i, u_j\} \in C} s(u_i, u_j) \, k(u_i) \, k(u_j),    (1)

where s(u_i, u_j) ∈ R is the strength assigned to connection {u_i, u_j}, and k(u_i) is the state of neuron u_i in configuration k. The difference in the consensus when the state of neuron u_i is changed and the states of all other neurons are unchanged in configuration k, ΔC_k(u_i), is given by [2]:

\Delta C_k(u_i) = (1 - 2k(u_i)) \left( \sum_{\{u_i, u_j\} \in C_{u_i}} s(u_i, u_j) \, k(u_j) + s(u_i, u_i) \right),    (2)

where C_{u_i} denotes the set of connections incident with neuron u_i ({u_i, u_i} ∉ C_{u_i}).
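The consensus and consensus-difference computations defined above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the symmetric weight matrix s stores the bias connection s(u_i, u_i) on its diagonal, and that a configuration k is a list of 0/1 neuron states.

```python
# Sketch of the consensus C_k and the consensus difference Delta C_k(u_i)
# for a Boltzmann Machine. Assumption: the symmetric weight matrix s holds
# the bias connection s(u_i, u_i) on its diagonal, and a configuration k
# is a list of 0/1 neuron states.

def consensus(s, k):
    """C_k: sum over connections {u_i, u_j} of s(u_i, u_j)*k(u_i)*k(u_j),
    plus the bias terms s(u_i, u_i)*k(u_i) (k(u_i)^2 == k(u_i) for 0/1)."""
    m = len(k)
    total = sum(s[i][i] * k[i] for i in range(m))   # bias connections
    for i in range(m):
        for j in range(i + 1, m):                   # count each pair once
            total += s[i][j] * k[i] * k[j]
    return total

def delta_consensus(s, k, i):
    """Change in C_k if neuron u_i flips while all other states stay fixed.
    C_{u_i} excludes the bias {u_i, u_i}, which enters as a separate term."""
    incident = sum(s[i][j] * k[j] for j in range(len(k)) if j != i)
    return (1 - 2 * k[i]) * (incident + s[i][i])
```

Note that delta_consensus touches only the connections incident with u_i, so a single trial costs O(d) for a neuron of degree d rather than the O(m^2) of recomputing the full consensus.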
The probability A_k(u_i, c) of accepting a state transition of neuron u_i, given the configuration k, is chosen as

A_k(u_i, c) = \frac{1}{1 + \exp(-\Delta C_k(u_i) / c)},    (3)

where c denotes the value of the control parameter (c ∈ R+), and ΔC_k(u_i) is given by Eq.(2). In the Boltzmann Machine learning formulation, the set of neurons is divided into three disjoint subsets U_i, U_h, and U_o, with U = U_i ∪ U_h ∪ U_o, where U_i, U_h, and U_o denote the sets of input, hidden, and output neurons, respectively. Let |U_i| = m_i, |U_h| = m_h, and |U_o| = m_o. The connections of the Boltzmann Machine assumed in this paper are chosen such that all input neurons are mutually connected, all output neurons are mutually connected, all hidden neurons are connected to all input and output neurons, and all neurons have a bias connection, as shown in Figure 1-(a). This is a commonly used connection pattern for the Boltzmann Machine when applied to classification problems.
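As a rough sketch of the acceptance rule and of the partitioned synchronous update the abstract describes, the following models each of the p processors trialing its own neuron subset against the current global configuration, then applying all accepted flips at once. The round-robin index partitioning, the helper names, and the toy weights in the usage example are illustrative assumptions, not the paper's exact mapping scheme.

```python
import math
import random

def accept_prob(delta_c, c):
    """A_k(u_i, c) = 1 / (1 + exp(-Delta C_k(u_i) / c))."""
    return 1.0 / (1.0 + math.exp(-delta_c / c))

def delta_consensus(s, k, i):
    """Consensus change if neuron u_i flips; bias s[i][i] added separately."""
    incident = sum(s[i][j] * k[j] for j in range(len(k)) if j != i)
    return (1 - 2 * k[i]) * (incident + s[i][i])

def synchronous_sweep(s, k, c, p, rng):
    """One synchronous sweep: each 'processor' q trials the neurons of its
    partition against the current global configuration; accepted flips are
    applied together, after which the processors would exchange the changed
    states over the interconnection network."""
    partitions = [range(q, len(k), p) for q in range(p)]   # round-robin mapping
    flips = [i for part in partitions for i in part        # runs in parallel
             if rng.random() < accept_prob(delta_consensus(s, k, i), c)]
    for i in flips:                                        # synchronous update
        k[i] = 1 - k[i]
    return k

# Example: four neurons on p = 2 'processors' at control parameter c = 0.5.
weights = [[0.5, 1.0, -2.0, 0.0],
           [1.0, 0.0, 3.0, -1.0],
           [-2.0, 3.0, 1.5, 0.0],
           [0.0, -1.0, 0.0, 0.2]]
state = synchronous_sweep(weights, [1, 0, 1, 0], c=0.5, p=2,
                          rng=random.Random(0))
```

Because all flips within a sweep are evaluated against the same configuration, only the changed states need to be communicated between processors at the end of each sweep, which is the source of the communication overhead the paper analyzes.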
Similar resources
An efficient mapping of Boltzmann Machine computations onto distributed-memory multiprocessors
Oh, D.H., J.H. Nang, H. Yoon and S.R. Maeng, An efficient mapping of Boltzmann Machine computations onto distributed-memory multiprocessors, Microprocessing and Microprogramming 33 (1991/92) 223-236. In this paper, an efficient mapping scheme of Boltzmann Machine computations onto a distributed-memory multiprocessor, which exploits the synchronous spatial parallelism, is presented. In this schem...
Delta Prolog: a Distributed Logic Programming Language and Its Implementation on Distributed Memory Multiprocessors
Delta Prolog is a logic programming language extending Prolog with constructs for sequential and parallel composition of goals, interprocess communication and synchronization, and external non-determinism. We present sequential and parallel search strategies for the language, based on the notion of derivations space. They rely upon distributed backtracking, a mechanism supporting the coordinat...
Parallel Solution of Sparse Linear Least Squares Problems on Distributed-Memory Multiprocessors
This paper studies the solution of large-scale sparse linear least squares problems on distributed-memory multiprocessors. The method of corrected semi-normal equations is considered. New block-oriented parallel algorithms are developed for solving the related sparse triangular systems. The arithmetic and communication complexities of the new algorithms applied to regular grid problems are anal...
Dynamic Scheduling on Distributed-Memory Multiprocessors
The problem of scheduling a set of applications on a multiprocessor system has been investigated from a number of different points of view. This paper describes our work on the scheduling problem at the user level, where we have to distribute evenly the parallel tasks that compose a program among a set of processors. We investigated dynamic scheduling heuristics applied to loops on distributed-mem...
Distributed Shared Abstractions (DSA) on Multiprocessors
Any parallel program has abstractions that are shared by the program's multiple processes, including data structures containing shared data, code implementing operations like global sums or minima, type instances used for process synchronization or communication, etc. Such shared abstractions can considerably affect the performance of parallel programs, on both distributed and shared memory mult...
Publication date: 2004